An Articulatory-Based Singing Voice Synthesis Using Tongue and Lips Imaging
Authors
Abstract
Ultrasound imaging of the tongue and videos of lip movements can be used to investigate specific articulation in speech or singing. In this study, tongue and lip image sequences recorded during singing performance are used to predict vocal tract properties via Line Spectral Frequencies (LSF). We focused our work on traditional Corsican singing, “Cantu in paghjella”. A multimodal Deep Autoencoder (DAE) extracts salient descriptors directly from the tongue and lip images. LSF values are then predicted from the most relevant of these features using a multilayer perceptron. A vocal tract model is derived from the predicted LSF, while a glottal flow model is computed from a synchronized electroglottographic recording. Articulatory-based singing voice synthesis is performed using both models. Both the prediction quality and the resulting singing voice synthesis outperform the state-of-the-art method.
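As an illustration of the final synthesis step only, the sketch below (Python/NumPy, not the authors' code) shows how a frame of predicted LSF values could be converted back into LPC coefficients and used as an all-pole vocal tract filter driven by a glottal source. The function names, the LSF order, the sampling rate, and the pulse-train source are placeholder assumptions; the paper itself derives its glottal flow model from the synchronized electroglottographic recording.

```python
import numpy as np
from scipy.signal import lfilter

def lsf_to_lpc(lsf):
    """Convert sorted Line Spectral Frequencies (radians, even order)
    into LPC coefficients of the all-pole vocal tract filter 1/A(z)."""
    order = len(lsf)
    # Sorted LSFs alternate between roots of the symmetric polynomial P(z)
    # and the antisymmetric polynomial Q(z), starting with P.
    p = np.array([1.0])
    for w in lsf[0::2]:
        p = np.convolve(p, [1.0, -2.0 * np.cos(w), 1.0])
    q = np.array([1.0])
    for w in lsf[1::2]:
        q = np.convolve(q, [1.0, -2.0 * np.cos(w), 1.0])
    p = np.convolve(p, [1.0, 1.0])    # trivial root of P at z = -1
    q = np.convolve(q, [1.0, -1.0])   # trivial root of Q at z = +1
    a = 0.5 * (p + q)
    return a[:order + 1]              # last coefficient cancels to zero

def synthesize_frame(lsf, source_frame):
    """Filter one glottal-source frame through the vocal tract filter
    derived from that frame's predicted LSF."""
    return lfilter([1.0], lsf_to_lpc(np.asarray(lsf)), source_frame)

# Placeholder source: a simple pulse train standing in for the EGG-derived
# glottal flow model; fs, f0, frame length and LSF order are assumptions.
fs, f0, frame_len = 16000, 220, 512
source = np.zeros(frame_len)
source[::fs // f0] = 1.0
lsf_frame = np.sort(np.random.uniform(0.05, np.pi - 0.05, 16))  # dummy predicted LSF
audio_frame = synthesize_frame(lsf_frame, source)
```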
Similar Resources
APEX - an articulatory model for speech and singing
The APEX articulatory synthesis model is being developed as a joint project at the Department of Speech, Music and Hearing at the Royal Institute of Technology and at the Department of Linguistics at Stockholm University. It is a direct development of an earlier vowel model [1], implemented as a computer program under Windows [2]. It calculates formants and produces sound according to articulat...
Across-speaker articulatory normalization for speaker-independent silent speech recognition
Silent speech interfaces (SSIs), which recognize speech from articulatory information (i.e., without using audio information), have the potential to enable persons with laryngectomy or a neurological disease to produce synthesized speech with a natural sounding voice using their tongue and lips. Current approaches to SSIs have largely relied on speaker-dependent recognition models to minimize t...
Synergy between jaw and lips/tongue movements: consequences in articulatory modelling
Linear component articulatory models [9, 10, 5] are built using an iterative subtraction of linear predictors of the vocal tract geometry. In this paper we consider the contribution of j...
Towards an Audiovisual Virtual Talking Head: 3D Articulatory Modeling of Tongue, Lips and Face Based on MRI and Video Images
A linear three-dimensional articulatory model of tongue, lips and face is presented. The model is based on a linear component analysis of the 3D coordinates defining the geometry of the different organs, obtained from Magnetic Resonance Imaging of the tongue, and from front and profile video images of the subject’s face marked with small beads. In addition to a common jaw height parameter, the ...
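For readers unfamiliar with linear component analysis of articulator coordinates, the minimal sketch below illustrates the general idea with ordinary PCA on flattened 3D shapes. It is a generic example under assumed variable names, not the guided analysis used to build this particular model.

```python
import numpy as np

def fit_linear_model(shapes, n_components=3):
    """PCA-style linear component model: `shapes` holds flattened 3D
    coordinates of one articulator, one row per observed posture."""
    mean = shapes.mean(axis=0)
    u, s, vt = np.linalg.svd(shapes - mean, full_matrices=False)
    basis = vt[:n_components]             # linear components (rows)
    weights = (shapes - mean) @ basis.T   # control parameters per posture
    return mean, basis, weights

def reconstruct(mean, basis, weights):
    # Any articulator geometry is the mean shape plus a weighted sum of components.
    return mean + weights @ basis

# Toy usage with random "postures" of 50 points in 3D (assumed sizes).
shapes = np.random.randn(40, 3 * 50)
mean, basis, weights = fit_linear_model(shapes)
approx = reconstruct(mean, basis, weights)
```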
Articulatory synthesis from x-rays and inversion for an adaptive speech robot
This paper describes a speech robotic approach to articulatory synthesis. An anthropomorphic speech robot has been built, based on a real reference subject’s data. This speech robot, called the Articulotron, has a set of relevant degrees of freedom for the speech articulators: jaw, tongue, lips, and larynx. The associated articulatory model has been elaborated from cineradiographic midsagittal prof...